Learning to Reconstruct Shapes from Unseen Classes
From a single image, humans are able to perceive the full 3D shape of an
object by exploiting learned shape priors from everyday life. Contemporary
single-image 3D reconstruction algorithms aim to solve this task in a similar
fashion, but often end up with priors that are highly biased by training
classes. Here we present an algorithm, Generalizable Reconstruction (GenRe),
designed to capture more generic, class-agnostic shape priors. We achieve this
with an inference network and training procedure that combine 2.5D
representations of visible surfaces (depth and silhouette), spherical shape
representations of both visible and non-visible surfaces, and 3D voxel-based
representations, in a principled manner that exploits the causal structure of
how 3D shapes give rise to 2D images. Experiments demonstrate that GenRe
performs well on single-view shape reconstruction, and generalizes to diverse
novel objects from categories not seen during training.
Comment: NeurIPS 2018 (Oral). The first two authors contributed equally to
this paper. Project page: http://genre.csail.mit.edu
Pix3D: Dataset and Methods for Single-Image 3D Shape Modeling
We study 3D shape modeling from a single image and make three contributions.
First, we present Pix3D, a large-scale benchmark of diverse
image-shape pairs with pixel-level 2D-3D alignment. Pix3D has wide applications
in shape-related tasks including reconstruction, retrieval, viewpoint
estimation, etc. Building such a large-scale dataset, however, is highly
challenging; existing datasets either contain only synthetic data, or lack
precise alignment between 2D images and 3D shapes, or only have a small number
of images. Second, we calibrate the evaluation criteria for 3D shape
reconstruction through behavioral studies, and use them to objectively and
systematically benchmark cutting-edge reconstruction algorithms on Pix3D.
Third, we design a novel model that simultaneously performs 3D reconstruction
and pose estimation; our multi-task learning approach achieves state-of-the-art
performance on both tasks.
Comment: CVPR 2018. The first two authors contributed equally to this work.
Project page: http://pix3d.csail.mit.ed
Visual Object Networks: Image Generation with Disentangled 3D Representation
Recent progress in deep generative models has led to tremendous breakthroughs
in image generation. However, while existing models can synthesize
photorealistic images, they lack an understanding of our underlying 3D world.
We present a new generative model, Visual Object Networks (VON), synthesizing
natural images of objects with a disentangled 3D representation. Inspired by
classic graphics rendering pipelines, we unravel our image formation process
into three conditionally independent factors (shape, viewpoint, and
texture) and present an end-to-end adversarial learning framework that jointly
models 3D shapes and 2D images. Our model first learns to synthesize 3D shapes
that are indistinguishable from real shapes. It then renders the object's 2.5D
sketches (i.e., silhouette and depth map) from its shape under a sampled
viewpoint. Finally, it learns to add realistic texture to these 2.5D sketches
to generate natural images. The VON not only generates images that are more
realistic than state-of-the-art 2D image synthesis methods, but also enables
many 3D operations such as changing the viewpoint of a generated image, editing
of shape and texture, linear interpolation in texture and shape space, and
transferring appearance across different objects and viewpoints.
Comment: NeurIPS 2018. Code: https://github.com/junyanz/VON Website:
http://von.csail.mit.edu
Height Information Aided 3D Real-Time Large-Scale Underground User Positioning
Because inertial and visual navigation equipment is costly and satellite navigation signals are lacking, these techniques cannot be used in large-scale underground mining environments. To solve this problem, this study proposes a large-scale underground 3D real-time positioning method with seam height assistance. The method uses ultra-wideband positioning base stations as its core and combines them with seam height information to build a factor graph confidence transfer model that realises 3D positioning. The simulation results show that the proposed real-time method is superior to the existing algorithms in positioning accuracy and can meet the needs of large-scale underground users.
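As a rough illustration of why height information helps, the sketch below fixes the user's height and solves for the horizontal position from three range measurements. It uses plain linearised trilateration, not the factor graph model the abstract describes. The anchor coordinates, the function name `locate_with_known_height`, and the noise-free ranges are all assumptions made for the example.

```python
import math

def locate_with_known_height(anchors, ranges, height):
    """Estimate (x, y) from three anchors when the user's height is known.

    anchors: three (x, y, z) base-station positions (hypothetical values).
    ranges:  three measured 3D distances to those anchors.
    height:  user's z coordinate, assumed known from seam height data.
    """
    # Project each 3D range onto the horizontal plane.
    horiz_sq = [d * d - (height - a[2]) ** 2 for a, d in zip(anchors, ranges)]
    (x0, y0, _), (x1, y1, _), (x2, y2, _) = anchors
    # Subtract the first circle equation from the other two to get a
    # 2x2 linear system in x and y.
    a11, a12 = 2 * (x1 - x0), 2 * (y1 - y0)
    a21, a22 = 2 * (x2 - x0), 2 * (y2 - y0)
    b1 = horiz_sq[0] - horiz_sq[1] + x1**2 + y1**2 - x0**2 - y0**2
    b2 = horiz_sq[0] - horiz_sq[2] + x2**2 + y2**2 - x0**2 - y0**2
    det = a11 * a22 - a12 * a21
    if abs(det) < 1e-12:
        raise ValueError("anchors are collinear in the horizontal plane")
    # Solve with Cramer's rule.
    x = (b1 * a22 - a12 * b2) / det
    y = (a11 * b2 - a21 * b1) / det
    return x, y, height

# Hypothetical layout: three anchors on the roof, user at a known height.
anchors = [(0.0, 0.0, 3.0), (10.0, 0.0, 3.0), (0.0, 10.0, 3.0)]
true_position = (3.0, 4.0, 1.0)
ranges = [math.dist(a, true_position) for a in anchors]
estimate = locate_with_known_height(anchors, ranges, height=1.0)
```

With the height fixed, each range constrains the user to a circle in the horizontal plane, so three anchors are enough for a unique fix. With real, noisy ranges a least-squares or probabilistic estimator would be needed instead of this exact solve.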
Time Reversal Aided Bidirectional OFDM Underwater Cooperative Communication Algorithm with the Same Frequency Transmission
In underwater acoustic channels, signal transmission may experience significant latency and attenuation that degrade the performance of underwater communication. Cooperative communication can mitigate this, but its spectrum efficiency is lower than that of traditional underwater communication. We therefore propose a time reversal aided bidirectional OFDM underwater cooperative communication algorithm. The algorithm allows all underwater sensor nodes to share the same uplink and downlink frequency simultaneously to improve the spectrum efficiency. Since same-frequency transmission produces larger intersymbol interference, we first adopt the time reversal method to reduce the multipath interference; we then use the self-information cancellation module to remove the self-signal of the OFDM block, because it is known to the sensor nodes. In the simulation part, we compare our proposed algorithm with the existing underwater cooperative transmission algorithms in terms of bit error ratio, transmission rate, and computation. The results show that our proposed algorithm doubles the spectrum efficiency at the same bit error ratio and achieves a higher transmission rate than the other underwater communication methods.
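The effect of time reversal on multipath can be seen in a small sketch. Filtering a signal with the time-reversed channel response makes the combined response equal to the channel's autocorrelation, which concentrates energy in one central tap. The example assumes a real-valued, hypothetical three-path channel `h`; a complex channel would also need conjugation. It does not model OFDM or the self-information cancellation step.

```python
def convolve(a, b):
    """Full discrete convolution of two real-valued sequences."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out

def time_reversal_response(channel):
    """Effective response when the time-reversed channel is used as a filter.

    For a real channel this is the channel's autocorrelation.
    """
    return convolve(channel, channel[::-1])

# Hypothetical multipath channel: a direct path plus two weaker echoes.
h = [1.0, 0.0, 0.5, 0.0, 0.25]
effective = time_reversal_response(h)
peak_index = max(range(len(effective)), key=lambda k: abs(effective[k]))
```

The central tap of `effective` collects the energy of all paths, while the remaining taps are weaker than that peak. This is the focusing property that reduces intersymbol interference.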
Subclass-balancing Contrastive Learning for Long-tailed Recognition
Long-tailed recognition with imbalanced class distribution naturally emerges
in practical machine learning applications. Existing methods such as data
reweighing, resampling, and supervised contrastive learning enforce the class
balance with a price of introducing imbalance between instances of head class
and tail class, which may ignore the underlying rich semantic substructures of
the former and exaggerate the biases in the latter. We overcome these drawbacks
by a novel "subclass-balancing contrastive learning (SBCL)" approach that
clusters each head class into multiple subclasses of similar sizes as the tail
classes and enforces representations to capture the two-layer class hierarchy
between the original classes and their subclasses. Since the clustering is
conducted in the representation space and updated during the course of
training, the subclass labels preserve the semantic substructures of head
classes. Meanwhile, it does not overemphasize tail class samples, so each
individual instance contributes to the representation learning equally. Hence,
our method achieves both the instance- and subclass-balance, while the original
class labels are also learned through contrastive learning among subclasses
from different classes. We evaluate SBCL over a list of long-tailed benchmark
datasets and it achieves the state-of-the-art performance. In addition, we
present extensive analyses and ablation studies of SBCL to verify its
advantages.
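The subclass idea can be sketched as follows. Each class is split into groups of roughly the tail-class size, so a large head class yields several subclasses and a tail class stays whole. This is only an illustration under simplifying assumptions: features are one-dimensional, and sorting plus equal-size chunking stands in for clustering in the representation space. The function name `assign_subclasses` and the toy data are hypothetical, and the contrastive loss is not shown.

```python
from collections import defaultdict

def assign_subclasses(features, labels, target_size):
    """Give every sample a (class, subclass) label.

    features:    one scalar feature per sample (a stand-in for embeddings).
    labels:      original class label per sample.
    target_size: desired subclass size, e.g. the tail-class size.
    """
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    subclass_of = [None] * len(labels)
    for label, indices in by_class.items():
        # A class smaller than target_size keeps a single subclass.
        n_sub = max(1, len(indices) // target_size)
        # Sorting by feature keeps nearby samples in the same subclass.
        ordered = sorted(indices, key=lambda i: features[i])
        for rank, idx in enumerate(ordered):
            subclass_of[idx] = (label, rank * n_sub // len(ordered))
    return subclass_of

# Toy data: one head class with six samples, one tail class with two.
labels = ["head"] * 6 + ["tail"] * 2
features = [0.1, 0.9, 0.2, 0.8, 0.5, 0.4, 0.3, 0.7]
subclasses = assign_subclasses(features, labels, target_size=2)

sizes = defaultdict(int)
for key in subclasses:
    sizes[key] += 1
```

In this toy case the head class is divided into three subclasses of two samples each, matching the tail class size. In the approach described above, the assignment would be recomputed as the learned representation changes during training.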